Helper functions in select
Introduction
We have already covered basic select operation here. There are certain helper functions which will help select columns conveniently.
Some of the helper functions are:
- starts_with(…)
- ends_with(…)
- contains(…)
- matches(…)
- num_range(…)
Procedure
We will be working with the following custom dataframe.
# package for creating dataframe
library(tibble)
df <- tibble(col_end1 = as.integer(c(1,2,3)),
col_end2 = as.integer(c(4,5,6)),
col2 = c(5.0,6.3,9.7),
col3 = c("string1", "string2","string3"),
col4 = c(FALSE, TRUE, FALSE),
col5 = as.factor(c("A","B","A")),
start_col1 = as.integer(c(11,22,33)),
start_col2 = as.integer(c(44,53,64)),
col_contain_1 = as.integer(c(101,228,303)),
col_contain_2 = as.integer(c(443,335,664)),
)
View(df)
The first few rows are as follows:
We will explore the following helper functions:
- starts_with(<pattern>): It will select the columns that start with the pattern specified.
- ends_with(<pattern>): It will select the columns that end with the pattern specified.
- contains(<pattern>): It will select the columns that contain the pattern specified.
- matches(<regex>): It will select the columns that match the specified regex.
- num_range(<column_name>, <integer_range>): It will select the columns that satisfy the column_name+integer_range combinations.
Code
1. starts_with:
# refer procedure for definition of df
# select all columns that start with start
View(select(df, starts_with("start")))
The output of above code is:
Thus we have selected the columns that start with the pattern “start”.
2. ends_with:
# refer procedure for definition of df
# select all columns that end with end1, end2
View(select(df, ends_with("end1"),ends_with("end2")))
The output of above code is:
Thus we have selected the columns that end with the pattern “end1” or “end2”.
3. contains:
# refer procedure for definition of df
# select all columns that contain "contain"
View(select(df, contains("contain")))
The output of above code is:
Thus we have selected the columns that contain the pattern “contain”.
4. matches:
# refer procedure for definition of df
# select columns that match the regex "^col"
View(select(df, matches("^col")))
The output of above code is:
Thus we have selected the columns that match the regex pattern “^col”, which means starts with col.
5. num_range:
# refer procedure for definition of df
# select columns col2, col3
View(select(df, num_range("col", 2:3)))
The output of above code is:
Thus we have selected the columns that form combinations of col with numbers 2 and 3 i.e. col2 and col3.
Conclusion
Thus we have successfully explored helper function for select operation.
References
- https://r4ds.had.co.nz/